The Importance of the Difference in Text Types to Keyword Extraction: Evaluating a Mechanism
Authors
Abstract
Information exists in every aspect of our lives, and the expansion of the web has accelerated this trend. The web feeds us enormous amounts of information, and the widespread use of computers and other hardware appliances has led us to a state where we have a great deal of information at hand, yet it is often of no use to us. People are frequently unable to find information that they really need but already own. How many times have you tried to find a specific article that you have, a specific e-mail that you received, or even an SMS from someone saying something specific? For this reason, many information retrieval techniques have been proposed and many information extraction mechanisms have been created. In this paper we provide an experimental evaluation of a keyword extraction mechanism and describe how we treat different types of text (news articles, publications, e-mails). This keyword extraction mechanism is part of a complete system that includes information retrieval, information extraction, categorization, and publication of information to a personalized portal.
Similar resources
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates automatic keyword extraction from the tables of contents of Persian e-books in the field of science using LDA topic modeling, evaluating the similarity of the extracted keywords with the golden standard and users' views of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
Improving Precision of Keywords Extracted From Persian Text Using Word2Vec Algorithm
Keywords can present the main concepts of the text without human intervention according to the model. Keywords are important vocabulary words that describe the text and play a very important role in accurate and fast understanding of the content. The purpose of extracting keywords is to identify the subject of the text and the main content of the text in the shortest time. Keyword extraction pl...
The Fractal Patterns of Words in a Text: A Method for Automatic Keyword Extraction
A text can be considered as a one dimensional array of words. The locations of each word type in this array form a fractal pattern with certain fractal dimension. We observe that important words responsible for conveying the meaning of a text have dimensions considerably different from one, while the fractal dimensions of unimportant words are close to one. We introduce an index quantifying the...
Robust Model for Text Extraction from Complex Video Inputs Based on SUSAN Contour Detection and Fuzzy C Means Clustering
The proposed system introduces a novel approach for extracting text effectively from different types of complex video inputs. The valuable information within the text can be deployed for text indexing and localization. The proposed system uses a contour-based protocol, such as the SUSAN algorithm, for evaluating contour detection. The system then explores the candidate text area and refines the edges by F...
TEXTUAL AND INTER-TEXTUAL ANALYSES OF IRANIAN EFL UNDERGRADUATES’ TYPES OF ENGLISH READING TOWARDS DEVELOPING A CAREFUL READING FRAMEWORK
This study investigated textual and inter-textual reading of a group of Iranian EFL undergraduates’ careful English reading types. In this research, Khalifa and Weir’s (2009) reading framework was used to propose a more inclusive aspect of a careful reading framework and the reading construct for instructional and assessment goals. The participants of this study were B.A. students of English Tr...